Abstract



Trapezoidal words are words having at most $n + 1$ distinct factors of length $n$ for every $n \ge 0$. They therefore encompass finite Sturmian words. We give combinatorial characterizations of trapezoidal words and exhibit a formula for their enumeration. We then separate trapezoidal words into two disjoint classes: open and closed. A trapezoidal word is closed if it has a factor that occurs only as a prefix and as a suffix; otherwise it is open. We investigate open and closed trapezoidal words, in relation with their special factors. We prove that Sturmian palindromes are closed trapezoidal words and that a closed trapezoidal word is a Sturmian palindrome if and only if its longest repeated prefix is a palindrome. We also define a new class of words, semicentral words, and show that they are characterized by the property that they can be written as $uxyu$, for a central word $u$ and two different letters $x, y$. Finally, we investigate the prefixes of the Fibonacci word with respect to the property of being open or closed trapezoidal words, and show that the sequence of open and closed prefixes of the Fibonacci word follows the Fibonacci sequence.



Keywords: Sturmian words, trapezoidal words, rich words, closed words, special factors, 
palindromes, enumerative formula. 



1. Introduction 

In combinatorics on words, the most famous class of infinite words is certainly that of Sturmian 
words. They code digital straight lines in the discrete plane having irrational slope, and are 



"Part of the results in this paper were presented by the third author at WORDS 2011, 8th International Conference 
on Words, 12-16 September 2011, Prague, Czech Republic [14].

* Corresponding author. 
Email addresses: michelangelo.bucci@utu.fi (Michelangelo Bucci), alessandro.deluca@unina.it
(Alessandro De Luca), gabriele.fici@unice.fr (Gabriele Fici)

Accepted for publication in Theoretical Computer Science January 22, 2013 



characterized by the property of having exactly $n + 1$ factors of length $n$, for every $n \ge 0$.

It is well known (see [18], Proposition 2.1.17) that a finite word $w$ is a factor of some Sturmian word if and only if $w$ is a binary balanced word, that is, for every letter $a$ and every pair of factors of $w$ of the same length, $u$ and $v$, one has that $u$ and $v$ contain the same number of $a$'s up to one, i.e.,

$$\bigl|\,|u|_a - |v|_a\,\bigr| \le 1. \tag{1}$$

Finite Sturmian words (finite factors of Sturmian words) have at most $n + 1$ factors of length $n$, for every $n \ge 0$. However, this property does not characterize them, as shown by the word $w = aaabab$, which is not Sturmian since the factors $aaa$ and $bab$ do not verify (1).

The finite words defined by the property of having at most n + 1 factors of length n, for 
every n > 0, are called trapezoidal words. The name comes from the fact that the graph of the 
function counting the distinct factors of each length (usually called the factor complexity) defines, 
for trapezoidal words, an isosceles trapezoid. 

Trapezoidal words were defined by de Luca [8] by means of combinatorial parameters on the repeated factors of finite words. For a finite word $w$ over a finite alphabet $\Sigma$, let $K_w$ be the minimal length of an unrepeated suffix of $w$ and let $R_w$ be the minimal length for which $w$ does not contain right special factors, i.e., factors having occurrences followed by distinct letters. De Luca proved that for any word $w$ of length $|w|$ one always has

$$R_w + K_w \le |w| \tag{2}$$

and studied the case in which equality holds. This is in fact the case in which $w$ is trapezoidal, since one can prove (see Proposition 2.8 later) that a word $w$ is trapezoidal if and only if

$$R_w + K_w = |w|. \tag{3}$$

Then, de Luca proved that finite Sturmian words verify (3) and are therefore trapezoidal.

The non-Sturmian trapezoidal words were later characterized by D'Alessandro [6]. Since a non-Sturmian word is not balanced, it must contain a pair of factors having the same length and not verifying (1). Such a pair is called a pathological pair. Let $w$ be a non-Sturmian word and $(f, g)$ its pathological pair of minimal length. D'Alessandro proved that $w$ is trapezoidal if and only if $w = pq$ for some $p \in \mathrm{Suff}(\{\tilde{z}_f\}^*)$ and $q \in \mathrm{Pref}(\{z_g\}^*)$, where $\tilde{z}_f$ is the reversal of the fractional root $z_f$ of $f$ and $z_g$ is the fractional root of $g$. This characterization is quite involved and will be discussed in detail later.

The characterization by D'Alessandro allows us to give an enumerative formula for trapezoidal words. Indeed, it is known that the number of Sturmian words of length $n$ is equal to

$$S(n) = 1 + \sum_{i=1}^{n} (n - i + 1)\,\phi(i)$$

(cf. [19, 17]), where $\phi$ is the Euler totient function, i.e., $\phi(n)$ is the number of positive integers smaller than or equal to $n$ and coprime with $n$. In Section 3 we prove that the number of non-Sturmian trapezoidal words of length $n$ is

$$T(n) = \sum_{i=0}^{\lfloor (n-4)/2 \rfloor} 2(n - 2i - 3)\,\phi(i + 2),$$

and thus the number of trapezoidal words of length $n$ is $S(n) + T(n)$ (Theorem 3.2).

We then divide trapezoidal words into two disjoint classes, according to the definition below.

Definition 1.1. Let $w$ be a finite word over an alphabet $\Sigma$. We say that $w$ is closed if it is empty or it has a factor (different from the word itself) occurring exactly twice in $w$, as a prefix and as a suffix, that is, with no internal occurrences. A word which is not closed is called open.

For example, the words $aabbaa$ and $ababa$ are closed, whereas the word $aabbaaa$ is open.
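Definition 1.1 can be tested mechanically. The following is a minimal Python sketch, not part of the original paper; the helper names `occurrences` and `is_closed` are chosen here for illustration. It tries every proper border of the word and checks whether one of them occurs exactly twice.

```python
def occurrences(w: str, u: str) -> list[int]:
    """Start positions of all (possibly overlapping) occurrences of u in w."""
    return [i for i in range(len(w) - len(u) + 1) if w.startswith(u, i)]


def is_closed(w: str) -> bool:
    """Closed: w is empty, or some border u != w occurs exactly twice in w.

    A border that occurs exactly twice occurs only as a prefix and as a
    suffix, i.e. it has no internal occurrences.
    """
    if not w:
        return True
    for k in range(len(w)):  # k = |u| < |w|, so u is a proper prefix of w
        u = w[:k]
        if w.endswith(u) and len(occurrences(w, u)) == 2:
            return True
    return False


for word in ["aabbaa", "ababa", "aabbaaa"]:
    print(word, "closed" if is_closed(word) else "open")
```

The empty border is included on purpose: for a single letter it occurs exactly twice, so single letters come out closed under this reading of the definition.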

The notion of closed word is closely related to the concept of complete return to a factor $u$ in a word $w$, as considered in [15]. A complete return to $u$ in $w$ is any factor of $w$ having exactly two occurrences of $u$, one as a prefix and one as a suffix. Hence $w$ is closed if and only if it is a complete return to one of its factors; such a factor is clearly both the longest repeated prefix and the longest repeated suffix of $w$.

The notion of closed word is also equivalent to that of periodic-like word [4]. A word $w$ is periodic-like if its longest repeated prefix does not appear in $w$ followed by different letters.

We derive some properties of open and closed trapezoidal words in Section 4. We prove that the set of closed trapezoidal words is contained in that of Sturmian words (Proposition 4.5). We then characterize the special factors of closed trapezoidal words (Lemma 4.6) by showing that every closed trapezoidal word $w$ contains a bispecial factor $u$ such that $u$ is a central word and the left (resp. right) special factors of $w$ are the prefixes (resp. suffixes) of $u$. We show that trapezoidal palindromes (hence, Sturmian palindromes) are closed words (Theorem 4.9), and that a closed trapezoidal word is a Sturmian palindrome if and only if its longest repeated prefix is a palindrome (Proposition 4.12).

In Section 5 we introduce semicentral words, which are trapezoidal words in which the longest repeated prefix, the longest repeated suffix, the longest left special factor and the longest right special factor all coincide. We prove in Theorem 5.2 that a trapezoidal word $w$ is semicentral if and only if it is of the form $w = uxyu$, for a central word $u$ and two different letters $x, y$.

We then study, in Section 6, the prefixes of the Fibonacci infinite word $f$, and show that the sequence of the numbers of consecutive prefixes of $f$ that are closed (resp. open) is the sequence of Fibonacci numbers (Theorem 6.1).

The paper ends with conclusions and open problems in Section 7.

2. Trapezoidal words 

An alphabet, denoted by $\Sigma$, is a finite set of symbols, called letters. A word over $\Sigma$ is a finite sequence of letters from $\Sigma$. The subset of the alphabet $\Sigma$ constituted by the letters appearing in $w$ is denoted by $\mathrm{alph}(w)$. We let $w_i$ denote the $i$-th letter of $w$. Given a word $w = w_1 w_2 \cdots w_n$, with $w_i \in \Sigma$ for $1 \le i \le n$, the nonnegative integer $n$ is the length of $w$, denoted by $|w|$. The empty word has length zero and is denoted by $\varepsilon$. The set of all words over $\Sigma$ is denoted by $\Sigma^*$. The set of all words over $\Sigma$ having length $n$ is denoted by $\Sigma^n$. For a letter $a \in \Sigma$, $|w|_a$ denotes the number of $a$'s appearing in $w$.

The word $\tilde{w} = w_n w_{n-1} \cdots w_1$ is called the reversal of $w$. A palindrome is a word $w$ such that $\tilde{w} = w$. In particular, we assume $\tilde{\varepsilon} = \varepsilon$ and so the empty word is a palindrome.

A word $z$ is a factor of a word $w$ if $w = uzv$ for some $u, v \in \Sigma^*$. In the special case $u = \varepsilon$ (resp. $v = \varepsilon$), we call $z$ a prefix (resp. a suffix) of $w$. An occurrence of a factor of a word $w$ is internal if it is neither a prefix nor a suffix of $w$. The sets of prefixes, suffixes and factors of the word $w$ are denoted respectively by $\mathrm{Pref}(w)$, $\mathrm{Suff}(w)$ and $\mathrm{Fact}(w)$. A border of the word $w$ is any word in $\mathrm{Pref}(w) \cap \mathrm{Suff}(w)$ different from $w$.

The factor complexity of a word $w$ is the function defined by $f_w(n) = |\mathrm{Fact}(w) \cap \Sigma^n|$, for every $n \ge 0$.

A factor $u$ of $w$ is left special (resp. right special) in $w$ if there exist $a, b \in \Sigma$, $a \ne b$, such that $au, bu \in \mathrm{Fact}(w)$ (resp. $ua, ub \in \mathrm{Fact}(w)$). A factor is bispecial in $w$ if it is both left and right special.

A period for the word $w$ is a positive integer $p$, with $0 < p \le |w|$, such that $w_i = w_{i+p}$ for every $i = 1, \ldots, |w| - p$. Since $|w|$ is always a period for $w$, every non-empty word has at least one period. We can unambiguously define the period of the word $w$ as the smallest of its periods. The period of $w$ is denoted by $\pi_w$. The fractional root $z_w$ of a word $w$ is its prefix whose length is equal to the period of $w$.

A word $w$ is a power if there exists a non-empty word $u$ and an integer $n > 1$ such that $w = u^n$. A word which is not a power is called primitive.

For example, let $w = aababba$. The left special factors of $w$ are $\varepsilon$, $a$, $ab$, $b$ and $ba$. The right special factors of $w$ are $\varepsilon$, $a$, $ab$ and $b$. Therefore, the bispecial factors of $w$ are $\varepsilon$, $a$, $ab$ and $b$. The period of $w$ is $\pi_w = 6$ and the fractional root of $w$ is $z_w = aababb$.
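The example above can be reproduced by brute force. The sketch below is illustrative only; the function names are ours, and the alphabet is assumed to be $\{a, b\}$. It enumerates all factors of a word and then applies the definitions of special factor and of period literally.

```python
def factors(w: str) -> set[str]:
    """All factors of w, including the empty word."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


def left_special(w: str, alphabet: str = "ab") -> set[str]:
    """Factors u such that cu is a factor of w for at least two letters c."""
    facts = factors(w)
    return {u for u in facts if sum((c + u) in facts for c in alphabet) >= 2}


def right_special(w: str, alphabet: str = "ab") -> set[str]:
    """Factors u such that uc is a factor of w for at least two letters c."""
    facts = factors(w)
    return {u for u in facts if sum((u + c) in facts for c in alphabet) >= 2}


def period(w: str) -> int:
    """Smallest period of a non-empty word w (0-based indices internally)."""
    return next(p for p in range(1, len(w) + 1)
                if all(w[i] == w[i + p] for i in range(len(w) - p)))


w = "aababba"
print(sorted(left_special(w), key=len))
print(sorted(right_special(w), key=len))
print(period(w), w[:period(w)])  # period and fractional root
```

This is quadratic in the number of factors and only meant for short words; suffix-based data structures would be the usual choice for long inputs.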

The following parameters were introduced by Aldo de Luca [8]:

Definition 2.1. Given a word $w \in \Sigma^*$, $H_w$ (resp. $K_w$) denotes the minimal length of a prefix (resp. a suffix) of $w$ which occurs only once in $w$.

Definition 2.2. Given a word $w \in \Sigma^*$, $L_w$ (resp. $R_w$) denotes the minimal length for which there are no left special factors (resp. no right special factors) of that length in $w$.

Example 2.3. Let $w = aaababa$. The longest left special factor of $w$ is $aba$, and it is also the longest repeated suffix of $w$; the longest right special factor of $w$ is $aa$, and it is also the longest repeated prefix of $w$. Thus, we have $L_w = 4$, $K_w = 4$, $R_w = 3$ and $H_w = 3$.

Notice that the values $H_w$, $K_w$, $L_w$ and $R_w$ are positive integers, except for the words that are powers of a single letter, in which case $R_w = L_w = 0$ and $K_w = H_w = |w|$. Moreover, one has $f_w(R_w) = f_w(L_w)$ and $\max\{R_w, K_w\} = \max\{L_w, H_w\}$ (cf. [8], Corollary 4.1).
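The four parameters are easy to compute directly from Definitions 2.1 and 2.2. The following Python sketch is an illustration under the assumption of the binary alphabet $\{a, b\}$; the function names `H`, `K`, `L`, `R` simply mirror the notation and are not taken from any library.

```python
def factors(w: str) -> set[str]:
    """All factors of w, including the empty word."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


def count(w: str, u: str) -> int:
    """Number of (possibly overlapping) occurrences of u in w."""
    return sum(w.startswith(u, i) for i in range(len(w) - len(u) + 1))


def H(w: str) -> int:
    """Minimal length of a prefix of w occurring only once in w."""
    return next(n for n in range(len(w) + 1) if count(w, w[:n]) == 1)


def K(w: str) -> int:
    """Minimal length of a suffix of w occurring only once in w."""
    return next(n for n in range(len(w) + 1) if count(w, w[len(w) - n:]) == 1)


def _min_len_without_special(w: str, extend) -> int:
    """Smallest n such that no factor of length n is special (binary alphabet)."""
    facts = factors(w)
    n = 0
    while any(len(u) == n and extend(u, "a") in facts and extend(u, "b") in facts
              for u in facts):
        n += 1
    return n


def L(w: str) -> int:
    return _min_len_without_special(w, lambda u, c: c + u)


def R(w: str) -> int:
    return _min_len_without_special(w, lambda u, c: u + c)


w = "aaababa"
print(H(w), K(w), L(w), R(w))   # the values of Example 2.3
print(R(w) + K(w) == len(w))    # equality (3) holds for this word
```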

The following proposition is from de Luca ([8], Proposition 4.2).

Proposition 2.4. Let $w$ be a word of length $|w|$ such that $|\mathrm{alph}(w)| > 1$ and set $m_w = \min\{R_w, K_w\}$ and $M_w = \max\{R_w, K_w\}$. The factor complexity $f_w$ is strictly increasing in the interval $[0, m_w]$, is nondecreasing in the interval $[m_w, M_w]$ and strictly decreasing in the interval $[M_w, |w|]$. Moreover, for $i$ in the interval $[M_w, |w|]$, one has $f_w(i + 1) = f_w(i) - 1$. If $R_w < K_w$, then $f_w$ is constant in the interval $[m_w, M_w]$.



Proposition 2.4 allows one to give the following definition.

Definition 2.5. A non-empty word $w$ is trapezoidal if

• $f_w(i) = i + 1$ for $0 \le i \le m_w$,

• $f_w(i + 1) = f_w(i)$ for $m_w \le i \le M_w - 1$,

• $f_w(i + 1) = f_w(i) - 1$ for $M_w \le i \le |w|$.

Figure 1: The graph of the complexity function of the trapezoidal word $w = aaababa$. One has $\min\{R_w, K_w\} = 3$ and $\max\{R_w, K_w\} = 4$.

Trapezoidal words were considered for the first time by de Luca [8]. The name trapezoidal was given by D'Alessandro [6]. The choice of the name is motivated by the fact that for these words the graph of the complexity function defines an isosceles trapezoid (possibly degenerated into a triangle).

Remark 2.6. It follows from the definition that a trapezoidal word is necessarily a binary word, since $f_w(1)$, the number of distinct letters appearing in $w$, must be 1 or 2.

Example 2.7. The word $w = aaababa$ considered in Example 2.3 is trapezoidal. See Figure 1.
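The trapezoid of Figure 1 can be recovered numerically. The sketch below is illustrative, with names of our choosing. It computes the factor complexity of a word and tests the bound of at most $n + 1$ factors of length $n$ recalled in the Introduction, which is used here as a working criterion for trapezoidality.

```python
def factor_complexity(w: str) -> list[int]:
    """f_w(n) for n = 0, ..., |w|: number of distinct factors of each length."""
    return [len({w[i:i + n] for i in range(len(w) - n + 1)})
            for n in range(len(w) + 1)]


def is_trapezoidal(w: str) -> bool:
    """At most n + 1 distinct factors of length n, for every n."""
    return all(c <= n + 1 for n, c in enumerate(factor_complexity(w)))


print(factor_complexity("aaababa"))  # rises by 1, stays flat, falls by 1
print(is_trapezoidal("aaababa"), is_trapezoidal("aabbaa"))
```

For $w = aaababa$ the values rise up to length $3$, stay constant between lengths $3$ and $4$, and then decrease by one at each step, matching $m_w = 3$ and $M_w = 4$.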

In the following proposition we gather some characterizations of trapezoidal words. 

Proposition 2.8. Let $w$ be a word. The following conditions are equivalent:

1. $w$ is trapezoidal;

2. $|w| = L_w + H_w$;

3. $|w| = R_w + K_w$;

4. $w$ is binary and has at most one left special factor of length $n$ for every $n \ge 0$;

5. $w$ is binary and has at most one right special factor of length $n$ for every $n \ge 0$;

6. $w$ has at most $n + 1$ distinct factors of length $n$ for every $n \ge 0$;

7. $|f_w(n + 1) - f_w(n)| \le 1$ for every $n \ge 0$.

Proof. The equivalence of (1), (2), (3), (4) and (5) is in [8] and [6]. The equivalence of (5) and (6) follows from elementary considerations on the factor complexity of binary words. Indeed, it is easy to see that the number of distinct factors of length $n + 1$ of a binary word $w$ is at most equal to the number of distinct factors of length $n$ plus the number of right special factors of length $n$. The equivalence of (7) and (1) comes directly from the definitions and from Proposition 2.4. □



From now on, we fix the alphabet $\Sigma = \{a, b\}$.

Recall that a word $w \in \Sigma^*$ is Sturmian if and only if it is balanced, i.e., verifies (1). The following proposition is from de Luca ([8], Proposition 7.1).



Proposition 2.9. Every Sturmian word is trapezoidal. 



The inclusion in Proposition 2.9 is strict, since there exist trapezoidal words that are not Sturmian, e.g. the word $w = aaababa$ considered in Example 2.3.

Recall that a word $w$ is rich [15] (or full [1]) if it has $|w| + 1$ distinct palindromic factors, that is, the maximum number of distinct palindromic factors a word can contain. The following proposition is from de Luca, Glen and Zamboni ([11], Proposition 2).

Proposition 2.10. Every trapezoidal word is rich. 



Again, the inclusion in Proposition 2.10 is strict, since there exist binary rich words that are 
not trapezoidal, e.g. the word w = aabbaa. 

We now discuss the case of non-Sturmian trapezoidal words. We first recall the definition and some properties of central words [12].



Definition 2.11. A word $w \in \Sigma^*$ is a central word if and only if it has two coprime periods $p$ and $q$ and length equal to $p + q - 2$.

Proposition 2.12 ([10], [3]). A word $w \in \Sigma^*$ is a central word if and only if it satisfies one of the following equations, for some $u, v \in \Sigma^*$ and $n \ge 0$:

1. $w = a^n$,

2. $w = b^n$,

3. $w = uabv = vbau$.

Moreover, if case 3 holds one has that $u$ and $v$ are central words.

It is known that if $w = uabv = vbau$ is a central word, then the longer of $u$ and $v$ has no internal occurrences in $w$. Thus, $w$ is a closed word. Note that this is also a consequence of our Theorem 4.9.



Proposition 2.13 ([7]). A word $w \in \Sigma^*$ is a central word if and only if $w$ is a palindrome and $awa$ and $bwb$ are both Sturmian words.

Proposition 2.14 ([12]). A word $w \in \Sigma^*$ is a central word if and only if there exists a Sturmian word $w'$ such that $w$ is a bispecial factor of $w'$.

Recall that a word $w$ is not balanced if and only if it contains a pathological pair $(f, g)$, that is, $f$ and $g$ are factors of $w$ of the same length but they do not verify (1). Moreover, if $f$ and $g$ are chosen of minimal length, there exists a palindrome $u$ such that $f = aua$ and $g = bub$, for two different letters $a$ and $b$ (see [18, Proposition 2.1.3] or [5, Lemma 3.06]).

We can also state that $f$ and $g$ are Sturmian words, since otherwise they would contain a pathological pair shorter than $|f| = |g|$ and hence $w$ would contain such a pathological pair, against the minimality of $f$ and $g$. So the word $u$ is a palindrome such that $aua$ and $bub$ are Sturmian words, that is, by Proposition 2.13, $u$ is a central word.

Moreover, we have the following:

Lemma 2.15. Let $w$ be a binary non-Sturmian word and $(f, g)$ a pathological pair of $w$ of minimal length. Then there exists a unique central word $u$ such that $f = aua$ and $g = bub$.

Proof. Suppose by contradiction that there exists a word $u' \ne u$, with $|u| = |u'|$, such that $w$ contains as factors $aua$, $bub$, $au'a$ and $bu'b$. Let $h$ be the longest common prefix of $u$ and $u'$. Since $u \ne u'$, there exist distinct letters $x, y \in \Sigma$ such that $hx$ is a prefix of $u$ and $hy$ is a prefix of $u'$. Then $w$ would contain as factors $aha$, $ahb$, $bha$ and $bhb$. Therefore, $w$ would contain a pathological pair $(aha, bhb)$ shorter than $(f, g)$, a contradiction. □

The following lemma is attributed to Aldo de Luca in [6]. We provide a proof for the sake of completeness.

Lemma 2.16. Let $w$ be a non-Sturmian word and $(f, g)$ the pathological pair of $w$ of minimal length. Then $f$ and $g$ do not overlap in $w$.

Proof. By contradiction, suppose that $w$ contains a factor $uvz$ such that $f = uv$ and $g = vz$, with $|v| > 0$. Since $\bigl|\,|f|_a - |g|_a\,\bigr| = \bigl|\,|u|_a - |z|_a\,\bigr|$, then $(u, z)$ would be a pathological pair of $w$ shorter than $(f, g)$. □

We then have: 

Proposition 2.17. Let $w$ be a binary non-Sturmian word. Then there exists a unique central word of minimal length $u$ such that $w = v_1\, aua\, v_2\, bub\, v_3$, for $a, b \in \Sigma$ distinct letters and $v_1, v_2, v_3 \in \Sigma^*$. We call the word $u$ the central root of $w$.

The following is the characterization of trapezoidal non-Sturmian words given by D'Alessandro [6].

Theorem 2.18. Let $w$ be a binary non-Sturmian word. Then $w$ is trapezoidal if and only if

$$w = pq, \quad \text{with } p \in \mathrm{Suff}(\{\tilde{z}_f\}^*),\ q \in \mathrm{Pref}(\{z_g\}^*), \tag{4}$$

where $(f, g)$ is the pathological pair of $w$ of minimal length. In particular, if $w$ is trapezoidal, then $K_w = |q|$ and the longest right special factor of $w$ is the prefix of $w$ of length $R_w - 1$, that is, the longest proper prefix of $p$.

Hence, trapezoidal words are either Sturmian or of the form described in Theorem 2.18.

Example 2.19. Let $w = aaababa$ be the non-Sturmian trapezoidal word considered in Example 2.3. We have $f = aaa$ and $g = bab$, so that $z_f = a$ and $z_g = ba$. The word $w$ can be written as $w = pq$, with $p = aaa$ and $q = baba$.
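The factorization of Example 2.19 can be found by exhaustive search. The following Python sketch is an illustration, not the authors' procedure. It assumes the alphabet $\{a, b\}$, orders the pathological pair so that $f$ is the factor occurring first in $w$ (an assumption made here to match the roles of $p$ and $q$), and tests membership in $\mathrm{Suff}(\{\tilde{z}_f\}^*)$ and $\mathrm{Pref}(\{z_g\}^*)$ against a sufficiently long power.

```python
def minimal_pathological_pair(w: str):
    """Shortest pair of equal-length factors whose numbers of a's differ by >= 2.

    Returns None when w is balanced. The pair is ordered by first occurrence
    in w (an assumption of this sketch).
    """
    for n in range(1, len(w) + 1):
        facs = sorted({w[i:i + n] for i in range(len(w) - n + 1)})
        for f in facs:
            for g in facs:
                if f.count("a") - g.count("a") >= 2:
                    return tuple(sorted((f, g), key=w.find))
    return None


def fractional_root(w: str) -> str:
    """Prefix of the non-empty word w whose length is the smallest period."""
    p = next(p for p in range(1, len(w) + 1)
             if all(w[i] == w[i + p] for i in range(len(w) - p)))
    return w[:p]


def factorizations(w: str) -> list[tuple[str, str]]:
    """All splits w = pq with p in Suff({~z_f}*) and q in Pref({z_g}*)."""
    pair = minimal_pathological_pair(w)
    if pair is None:
        return []
    f, g = pair
    zf_rev, zg = fractional_root(f)[::-1], fractional_root(g)
    result = []
    for i in range(len(w) + 1):
        p, q = w[:i], w[i:]
        # A power with len(p) + 1 (resp. len(q) + 1) copies is long enough.
        if (zf_rev * (len(p) + 1)).endswith(p) and (zg * (len(q) + 1)).startswith(q):
            result.append((p, q))
    return result


print(minimal_pathological_pair("aaababa"))
print(factorizations("aaababa"))
```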



Remark 2.20. From now on, in accordance with Theorem 2.18, $(f, g)$ denotes the pathological pair of $w$ of minimal length.

We now show that the words $p$ and $q$ of the factorization (4) are Sturmian words. We premise the following known result:

Proposition 2.21 (cf. [10]). Let $w$ be a finite word and $z_w$ its fractional root. Then $w$ is Sturmian if and only if so is every power of $z_w$.

Lemma 2.22. Let $w$ be a trapezoidal non-Sturmian word. Then the words $p$ and $q$ of the factorization (4) are Sturmian words.

Proof. Since $f$ and $g$ are Sturmian, by the preceding proposition all powers of $z_f$ and $z_g$ are Sturmian as well. Observe that $p \in \mathrm{Suff}(\{\tilde{z}_f\}^*)$ can be restated as $\tilde{p} \in \mathrm{Pref}(\{z_f\}^*)$. As prefixes and reversals of Sturmian words are Sturmian, the assertion follows. □

The following result, due to de Luca, Glen and Zamboni [11], states that trapezoidal palindromes are all Sturmian. We give a new proof based on Theorem 2.18.

Theorem 2.23. The following conditions are equivalent:

1. $w$ is a trapezoidal palindrome;

2. $w$ is a Sturmian palindrome.

Proof. Let $w$ be a trapezoidal palindrome. If $w$ is non-Sturmian we can write, by Theorem 2.18 and using that $w$, $f$ and $g$ are palindromes, $w = pq = v_1 f g v_2 = \tilde{v}_2\, g f\, \tilde{v}_1$, with $v_1, v_2 \in \Sigma^*$ such that $p = v_1 f$ and $q = g v_2$. Without loss of generality, we can assume $|v_1| > |v_2|$; then $p = v_1 f$ contains $f$ and $g$ as factors, a contradiction since, by Lemma 2.22, $p$ is Sturmian. So $w$ cannot be a non-Sturmian word.

Hence, we proved that trapezoidal palindromes are Sturmian. Since by Proposition 2.9 Sturmian words are trapezoidal, the assertion follows. □

3. An enumerative formula for trapezoidal words

Mignosi [19] proved the following formula (see also [17]) for the number $S(n)$ of Sturmian words of length $n$:

$$S(n) = 1 + \sum_{i=1}^{n} (n - i + 1)\,\phi(i), \tag{5}$$

where $\phi$ is the Euler totient function.



Therefore, by Proposition 2.9, in order to count the number of trapezoidal words of length $n$, it is sufficient to count the number of non-Sturmian trapezoidal words of that length. Recall from Theorem 2.18 that every non-Sturmian trapezoidal word $w$ can be uniquely written as $w = pq$, with $p \in \mathrm{Suff}(\{\tilde{z}_f\}^*)$ and $q \in \mathrm{Pref}(\{z_g\}^*)$, with $f = aua$, $g = bub$, where $u$ is the central root of $w$ and $(f, g)$ is the pathological pair of $w$ of minimal length. Our goal is to count, for a given central word $u$, how many non-Sturmian trapezoidal words exist with central root $u$ and length $n$.

Let us premise the following technical lemma, which is a straightforward consequence of Theorem 2.18.

Lemma 3.1. Let $u$ be a central word, and $x, y \in \Sigma$ different letters. The non-Sturmian trapezoidal words having central root $u$ are exactly the words of the form $pq$, for any $p \in \mathrm{Suff}(\{\tilde{z}_{xux}\}^*)$ with $|p| \ge |u| + 2$, and $q \in \mathrm{Pref}(\{z_{yuy}\}^*)$ with $|q| \ge |u| + 2$.

Theorem 3.2. For all $n > 0$, the number of trapezoidal words of length $n$ is

$$1 + \sum_{i=1}^{n} (n - i + 1)\,\phi(i) + \sum_{i=0}^{\lfloor (n-4)/2 \rfloor} 2(n - 2i - 3)\,\phi(i + 2).$$

Proof. We first notice that for any word $z$ and any $j \ge 0$, there exists a unique word of length $j$ in $\mathrm{Suff}(\{z\}^*)$ (resp. in $\mathrm{Pref}(\{z\}^*)$). From this remark and Lemma 3.1, we deduce that for any $n \ge 4$ and for any central word $u$ of length $0 \le |u| \le \lfloor (n - 4)/2 \rfloor$, there are exactly $2(n - 2(|u| + 2) + 1) = 2(n - 2|u| - 3)$ non-Sturmian trapezoidal words of length $n$ having central root $u$. Since there are $\phi(i + 2)$ central words of length $i$ (see [12]), we have that the number of non-Sturmian trapezoidal words of length $n$ is

$$T(n) = \sum_{i=0}^{\lfloor (n-4)/2 \rfloor} 2(n - 2i - 3)\,\phi(i + 2). \tag{6}$$

Since the number of trapezoidal words of length $n$ is $S(n) + T(n)$, by (5) and (6) the statement follows. □

In Table 1 we give the number of trapezoidal words of each length up to 20.

n        |  1   2   3   4   5   6   7    8    9   10   11   12   13   14   15   16   17    18    19    20
Trap(n)  |  2   4   8  16  28  46  70  102  140  190  250  318  398  496  602  724  862  1018  1192  1382

Table 1: The number of trapezoidal words of each length up to 20.
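The values in Table 1 can be reproduced from the formula of Theorem 3.2 and cross-checked by exhaustive enumeration for small lengths. The sketch below is illustrative; the function names are ours, and the brute-force test uses the bound of at most $n + 1$ factors of length $n$ as the criterion for being trapezoidal.

```python
from itertools import product
from math import gcd


def phi(n: int) -> int:
    """Euler totient: number of 1 <= k <= n coprime with n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def S(n: int) -> int:
    """Number of Sturmian words of length n, formula (5)."""
    return 1 + sum((n - i + 1) * phi(i) for i in range(1, n + 1))


def T(n: int) -> int:
    """Number of non-Sturmian trapezoidal words of length n, formula (6)."""
    # For n < 4 the range is empty and the sum is 0.
    return sum(2 * (n - 2 * i - 3) * phi(i + 2) for i in range((n - 4) // 2 + 1))


def is_trapezoidal(w: str) -> bool:
    """At most k + 1 distinct factors of length k, for every k."""
    return all(len({w[i:i + k] for i in range(len(w) - k + 1)}) <= k + 1
               for k in range(len(w) + 1))


def brute_force_count(n: int) -> int:
    """Count trapezoidal words of length n over {a, b} by enumeration."""
    return sum(is_trapezoidal("".join(t)) for t in product("ab", repeat=n))


print([S(n) + T(n) for n in range(1, 21)])
print([brute_force_count(n) for n in range(1, 11)])
```

The enumeration grows as $2^n$, so it is only practical as a sanity check for short lengths; the closed formula is what scales.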



4. Open and closed trapezoidal words 

In this section we derive some properties of open and closed trapezoidal words. Recall that a 
word w is closed if it is empty or it has a factor (different from w) occurring exactly twice in w, as 
a prefix and as a suffix; otherwise it is open. 

Remark 4.1. Let $w$ be a non-empty word over $\Sigma$. The following characterizations of closed words follow easily from the definition:

1. w has a factor occurring exactly twice in w, as a prefix and as a suffix of w; 

2. the longest repeated prefix of w does not have internal occurrences in w, that is, occurs in w 
only as a prefix and as a suffix; 

3. the longest repeated suffix of w does not have internal occurrences in w, that is, occurs in w 
only as a suffix and as a prefix; 

4. the longest repeated prefix of w is not right special in w; 

5. the longest repeated suffix of w is not left special in w; 

6. w has a border that does not have internal occurrences in w; 

7. the longest border of w does not have internal occurrences in w; 

8. w is the complete return to its longest repeated prefix; 

9. w is the complete return to its longest border; 

10. $w = uv = zu$, with $v$, $z$ non-empty, and $\mathrm{Fact}(w) \cap \Sigma u \Sigma = \emptyset$.

Proposition 4.2. Let $w$ be a word. Then $w$ is open (resp. closed) if and only if $\tilde{w}$ is open (resp. closed).

Proof. Straightforward. □

The following proposition gives a characterization of open trapezoidal words. 



Proposition 4.3. Let w be a trapezoidal word. Then the following conditions are equivalent: 

1. w is open; 

2. the longest repeated prefix of w is also the longest right special factor of w; 

3. the longest repeated suffix of w is also the longest left special factor of w. 

Proof. (1) ⇒ (2). Let $h_w$ be the longest repeated prefix of $w$ and $x$ the letter such that $h_w x$ is a prefix of $w$. Since $w$ is not closed, $h_w$ has a second non-suffix occurrence in $w$ followed by a letter $y$. Since $h_w$ is the longest repeated prefix of $w$, we have $y \ne x$. Therefore, $h_w$ is right special in $w$.

Suppose that $w$ has a right special factor $r$ longer than $h_w$. Since $w$ is trapezoidal, $w$ has at most one right special factor for each length (Proposition 2.8). Since the suffixes of a right special factor are right special factors, we have that $h_w$ must be a proper suffix of $r$. Since $r$ is right special in $w$, it has at least two occurrences in $w$ followed by different letters. This implies a non-prefix occurrence of $h_w x$ in $w$, against the definition of $h_w$.

(2) ⇒ (1). Let $h_w$ be the longest repeated prefix of $w$. Since $h_w$ is a right special factor of $w$, there exist different letters $a$ and $b$ such that $h_w a$ and $h_w b$ are factors of $w$. This implies that $h_w$ cannot appear in $w$ only as a prefix and as a suffix, that is, $w$ cannot be a closed word.

(1) ⇔ (3) can be proven symmetrically. □

Proposition 4.4. Let $w$ be a trapezoidal word. If $w$ is open, then $H_w = R_w$ and $K_w = L_w$. If $w$ is closed, then $H_w = K_w$ and $L_w = R_w$.

Proof. The claim for open trapezoidal words follows directly from Proposition 4.3.

Suppose that $w$ is a closed trapezoidal word. Then $H_w = K_w$ (since $w$ is closed) and therefore $L_w = R_w$ (since $w$ is trapezoidal). □

The latter proposition, however, does not characterize open and closed trapezoidal words. Actually, one can have $H_w = K_w = L_w = R_w$, as is the case in the word $abba$, which is closed, and also in the word $aaba$, which is open.

Open trapezoidal words can be Sturmian (e.g. $aaabaa$) or not (e.g. $aaabab$). Closed trapezoidal words, instead, are always Sturmian, as shown in the following proposition, which can also be found in a paper of the first two authors together with Aldo de Luca (cf. [2], Proposition 3.6). Here we report a proof following a totally different approach, based on Theorem 2.18.

Proposition 4.5. Let $w$ be a trapezoidal word. If $w$ is closed, then $w$ is Sturmian.

Proof. Suppose that $w$ is not Sturmian. Then, by Theorem 2.18, $w = pq$, $p \in \mathrm{Suff}(\{\tilde{z}_f\}^*)$, $q \in \mathrm{Pref}(\{z_g\}^*)$, with $(f, g)$ being the pathological pair of $w$ of minimal length, $K_w = |q|$ (and hence $R_w = |p|$ since $w$ is trapezoidal) and the longest right special factor of $w$ is the prefix of $w$ of length $R_w - 1$.

Let $k_w$ be the suffix of $q$ of length $K_w - 1$, that is, the longest repeated suffix of $w$. Since $w$ is closed, the only other occurrence of $k_w$ in $w$ is as a prefix of $w$.

If $R_w \ge K_w$, then $k_w$ is a prefix of the longest right special factor of $w$. This is a contradiction with the fact that $k_w$ does not have internal occurrences in $w$.

If $R_w < K_w$, then $p$ is a prefix of $k_w$ and hence $p$ is a factor of $q$. This implies that $f$ is a factor of $q$, and therefore $q$ contains both $f$ and $g$ as factors. Hence $q$ would be non-Sturmian, contradicting Lemma 2.22. □
As a corollary of Proposition 4.5, we have that every trapezoidal word is open or Sturmian. We now focus on closed trapezoidal words and their special factors.

Lemma 4.6. Let $w$ be a closed trapezoidal word and $u$ the longest left special factor of $w$. Then $u$ is also the longest right special factor of $w$ (and thus $u$ is a bispecial factor of $w$). Moreover, $u$ is a central word.

Proof. Let $u$ be the longest left special factor of $w$, so that there exist different letters $a, b \in \Sigma$ such that $au$ and $bu$ are factors of $w$.

We claim that $au$ and $bu$ both occur in $w$ followed by some letter. Indeed, suppose the contrary. Then one of $au$ and $bu$, say $au$, appears in $w$ only as a suffix. Let $k_w$ be the longest repeated suffix of $w$. Since $u$ is a repeated suffix of $w$, we have $|k_w| \ge |u|$. If $|k_w| = |u|$, then $k_w = u$ and since $w$ is closed, $u$ appears in $w$ only as a prefix and as a suffix, against the hypothesis that $bu$ is a factor of $w$. So $|k_w| > |u|$ and therefore $au$ must be a suffix of $k_w$. This implies an internal occurrence of $au$ in $w$, a contradiction.

So, there exist letters $x, y$ such that $aux$ and $buy$ are factors of $w$. Now, we must have $x \ne y$, since otherwise $ux$ would be a left special factor of $w$ longer than $u$. Thus $u$ is right special in $w$. Since $w$ is closed, we have, by Proposition 4.4, $L_w = R_w$, and therefore $u$ is the longest right special factor of $w$.

By Proposition 4.5, $w$ is Sturmian. Since $u$ is a bispecial factor of a Sturmian word, we have by Proposition 2.14 that $u$ is a central word. □



Example 4.7. Let $w = aababaaba$. The longest repeated prefix of $w$ is $aaba$, which is also the longest repeated suffix and does not have internal occurrences, hence $w$ is a closed trapezoidal word.

The longest left special factor of $w$ is $aba$, which is also the longest right special factor of $w$ and it is a central word.
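The claims of Example 4.7 can be verified by direct computation. The following Python sketch is illustrative only, with hypothetical helper names. It finds the longest repeated prefix, the longest special factors, and tests centrality through the characterization of Proposition 2.13, using balance as the test for being Sturmian.

```python
def factors(w: str) -> set[str]:
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


def occurrences(w: str, u: str) -> list[int]:
    return [i for i in range(len(w) - len(u) + 1) if w.startswith(u, i)]


def longest_repeated_prefix(w: str) -> str:
    """Longest proper prefix of w with at least two occurrences in w."""
    return max((w[:n] for n in range(len(w)) if len(occurrences(w, w[:n])) >= 2),
               key=len, default="")


def longest_special(w: str, extend):
    """Longest factor u with extend(u, 'a') and extend(u, 'b') both factors.

    Returns None if there is no such factor; ties are broken arbitrarily,
    which cannot happen for trapezoidal words.
    """
    facts = factors(w)
    special = [u for u in facts
               if extend(u, "a") in facts and extend(u, "b") in facts]
    return max(special, key=len, default=None)


def is_balanced(w: str) -> bool:
    """Condition (1): equal-length factors differ by at most one in their a's."""
    for n in range(1, len(w) + 1):
        counts = [w[i:i + n].count("a") for i in range(len(w) - n + 1)]
        if max(counts) - min(counts) > 1:
            return False
    return True


def is_central(u: str) -> bool:
    """Proposition 2.13: u is a palindrome and aua, bub are both balanced."""
    return u == u[::-1] and is_balanced("a" + u + "a") and is_balanced("b" + u + "b")


w = "aababaaba"
h = longest_repeated_prefix(w)
print(h, occurrences(w, h))
print(longest_special(w, lambda u, c: c + u),   # longest left special factor
      longest_special(w, lambda u, c: u + c))   # longest right special factor
print(is_central("aba"))
```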



By Theorem 2.23, trapezoidal palindromes coincide with Sturmian palindromes, so the next results will be stated in terms of Sturmian palindromes but one can replace the word "Sturmian" with the word "trapezoidal".

Some results on Sturmian palindromes can be found in [9]. In particular, we recall here the following one.

Theorem 4.8 ([9], Theorem 29). A palindrome $w \in \Sigma^*$ is Sturmian if and only if its minimal period $\pi_w$ satisfies $\pi_w = R_w + 1$.



The next theorem shows that Sturmian palindromes, and so, by Theorem 2.23, trapezoidal palindromes, are all closed words.

Theorem 4.9. Let $w$ be a Sturmian palindrome. Then $w$ is closed.

Proof. By contradiction, suppose that $w$ is open. Then $h_w$, the longest repeated prefix of $w$, is also the longest right special factor of $w$ (Proposition 4.3). Since $w$ is a palindrome, we have that the longest repeated suffix of $w$ is $\tilde{h}_w$, the reversal of $h_w$. In particular, then, $K_w = H_w$.

By Proposition 4.4, $H_w = R_w$ and $K_w = L_w$. Thus we have $H_w = R_w = K_w = L_w = |w|/2$, since $w$ is trapezoidal (see Proposition 2.8). It follows that $w = h_w\, xx\, \tilde{h}_w$, for a letter $x \in \Sigma$.

By Theorem 4.8, the period of $w$ is $R_w + 1 = |h_w xx|$, so that $h_w = \tilde{h}_w$. Therefore, we have $w = h_w\, xx\, h_w$.

Since $h_w$ is right special in $w$, there exists a letter $y \ne x$ such that $h_w y$ is a factor of $w$. Now, any occurrence of $h_w y$ in $w$ cannot be preceded by the letter $x$, since $h_w = \tilde{h}_w$ is the longest repeated suffix of $w$. Thus $w$ contains the factor $y h_w y$.

Hence, $w$ contains both $h_w xx$ and $y h_w y$ as factors, and this contradicts the fact that $w$ is Sturmian. □




Figure 2: A Venn diagram illustrating the inclusion relations of trapezoidal words. 



From Theorem 4.9 and Lemma 4.6, we derive the following characterization of the special factors of Sturmian palindromes.

Corollary 4.10. Let $w$ be a Sturmian palindrome. Then the longest left special factor of $w$ is also the longest right special factor of $w$ and it is a central word.

Let us recall the following characterization of rich words from [15]:

Proposition 4.11. A word $w$ is rich if and only if every factor of $w$ which is a complete return to a palindrome is palindromic itself.

We have the following result:

Proposition 4.12. Let $w$ be a closed trapezoidal word and $h_w$ its longest repeated prefix. Then $w$ is a Sturmian palindrome if and only if $h_w$ is a palindrome.

Proof. The word $w$ is a complete return to $h_w$, and it is rich by Proposition 2.10. Since trapezoidal palindromes coincide with Sturmian palindromes by Theorem 2.23, the preceding proposition implies the assertion. □
The general structure of a closed trapezoidal word is depicted in Figure 3.

Figure 3: The structure of a closed trapezoidal word. The longest repeated prefix, $h_w$, is also the longest repeated suffix and does not have internal occurrences. The longest left special factor, $u$, is also the longest right special factor and it is a central word (Lemma 4.6). The word is a palindrome if and only if $h_w$ is a palindrome (Proposition 4.12).

Remark 4.13. Using Proposition 4.11, one can prove that Theorem 4.9 can be generalized to rich palindromes. Indeed, let $w$ be a rich palindrome, and $u$ the longest border of $w$. In order to prove that $w$ is closed, it is sufficient to prove that $u$ does not have internal occurrences in $w$. Suppose the contrary. Then $w$ would contain a proper prefix of the form $uzu$. By Proposition 4.11, $uzu$ is a palindrome. Hence, $uzu$ would be a suffix of $w$, against the maximality of $u$.

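Notice that the converse is not true: a closed palindrome need not be rich. The word
w = aaababbaabbabaaa is a closed palindrome of length 16 but contains only 16 palindromic factors
(counting the empty word), whereas a rich word of length 16 contains 17, so it is not rich.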
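The counterexample above can be checked mechanically. The following Python sketch is ours; the helper names (`palindromic_factors`, `is_rich`, `is_closed`) are hypothetical. It assumes the usual convention that the empty word counts as a palindromic factor, so that a word of length n is rich when it has n + 1 palindromic factors, and it tests closedness through the longest border.

```python
def palindromic_factors(w):
    """Set of distinct palindromic factors of w, the empty word included."""
    pals = {""}
    for i in range(len(w)):
        for j in range(i + 1, len(w) + 1):
            f = w[i:j]
            if f == f[::-1]:
                pals.add(f)
    return pals

def is_rich(w):
    """A word of length n is rich iff it has n + 1 palindromic factors."""
    return len(palindromic_factors(w)) == len(w) + 1

def occurrences(v, w):
    """Number of (possibly overlapping) occurrences of v in w."""
    return sum(1 for i in range(len(w) - len(v) + 1) if w.startswith(v, i))

def is_closed(w):
    """w is closed iff |w| <= 1 or its longest border occurs exactly twice."""
    if len(w) <= 1:
        return True
    for k in range(len(w) - 1, -1, -1):  # try the longest proper border first
        if w[:k] == w[len(w) - k:]:
            return occurrences(w[:k], w) == 2
    return False  # not reached: the empty border always matches

w = "aaababbaabbabaaa"
print(w == w[::-1], is_closed(w), len(palindromic_factors(w)), is_rich(w))
# prints: True True 16 False
```

The longest border of w is aaa, which occurs only as a prefix and as a suffix, while the prefix of length 9 contributes no new palindrome, which accounts for the missing palindromic factor.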

5. Semicentral words 

In what follows, we let $h_w$, $k_w$, $r_w$, $l_w$ respectively denote the longest repeated prefix, the
longest repeated suffix, the longest right special factor and the longest left special factor of the
word w. An intriguing class of trapezoidal words is that of words in which all these factors coincide.

Definition 5.1. Let w be a trapezoidal word. We say that w is semicentral if $h_w = k_w = r_w = l_w$.



Notice that, by Proposition 4.3, semicentral words can also be defined as those open trapezoidal 
words in which the longest repeated prefix is also the longest repeated suffix. 
The following theorem provides a characterization of semicentral words. 

Theorem 5.2. A trapezoidal word w is semicentral if and only if there exists a central word u
such that $w = uxyu$ with $x, y \in \Sigma$, $x \neq y$.

Proof. By (5), if w is semicentral we can write $w = uxyu$, with $u = h_w = k_w = r_w = l_w$ and x, y
(possibly equal) letters in $\Sigma$. Since u is a bispecial factor of w, it must have a third occurrence in
w; as $u = h_w = k_w$, such an occurrence cannot be preceded by y or followed by x. We have then

$$w = uxyu = \alpha \bar{y} u \bar{x} \beta \qquad (7)$$

where $\Sigma = \{x, \bar{x}\} = \{y, \bar{y}\}$, $\alpha, \beta \in \Sigma^*$ and $|\alpha| + |\beta| = |u|$. If $\alpha = u$, then $uxyu = u\bar{y}u\bar{x}$, so that
$x \neq y$ and $yu = uy$. Thus $u = y^n$ for some $n \geq 0$, so that it is a central word. Similarly, if $\beta = u$
it follows that $x \neq y$ and $u = x^n$ is central.

Let then $0 < |\alpha|, |\beta| < |u|$. From (7) it follows that there exists a prefix $u_1 x$ of u such that
$ux = \alpha \bar{y} u_1 x$, and a suffix $y u_2$ of u such that $yu = y u_2 \bar{x} \beta$. Hence, as $|u| = |\alpha| + |\beta|$, we obtain

$$u = u_1 x y u_2 = u_2 \bar{x} \bar{y} u_1.$$

Again, this implies $x \neq y$, otherwise the number of occurrences of x in $u_1 x y u_2$ and $u_2 \bar{x} \bar{y} u_1$ would
be different. Thus, $u = u_1 x y u_2 = u_2 y x u_1$ is central by Proposition 2.12.

Conversely, let u be a central word and $w = uxyu$, with $\Sigma = \{x, y\}$. We claim that xuy
occurs in w. This is clear if u is a power of x or y; by Proposition 2.12, the other possibility is
$u = u_1 x y u_2 = u_2 y x u_1$ for some words $u_1, u_2$, so that

$$w = uxyu = u_2 y \cdot x u_1 x y u_2 y \cdot x u_1$$

gives the desired occurrence of xuy. Hence u is a right special factor and a repeated suffix of w, so
that $R_w \geq |u| + 1$ and $K_w \geq |u| + 1$. Since $|w| = 2(|u| + 1) \leq R_w + K_w$, and since $|w| \geq R_w + K_w$
holds for every word with equality exactly for trapezoidal words, it follows that w
is trapezoidal and $k_w = r_w = u$. Analogously, since u is a left special factor and a repeated prefix
of w, one proves that $h_w = l_w = u$. □



By Proposition 2.12, every central word u different from a power of a single letter can be written
as $u = u_1 x y u_2 = u_2 y x u_1$ for central words $u_1$ and $u_2$ and distinct letters x, y. Theorem 5.2 provides
a similar characterization for semicentral words (and this motivates the choice of the name).

It is also worth noticing that semicentral words are non-strictly bispecial Sturmian words (see
[13]), that is, for any semicentral word w one has that aw, bw, wa and wb are all Sturmian words,
but nevertheless either awb or bwa is not Sturmian.

Example 5.3. The semicentral words of length 8 are: aaaabaaa, aaabaaaa, abaababa, ababaaba, 
bababbab, babbabab, bbbabbbb and bbbbabbb. 
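The list in Example 5.3 can be recovered directly from Definition 5.1. The following Python sketch is ours; the function names (`special_data`, `is_semicentral`, `semicentral_words`) are hypothetical. It computes the four factors by brute force over the alphabet {a, b} and treats a missing or non-unique longest special factor as undefined.

```python
from itertools import product

def factors(w):
    """All factors of w, the empty word included."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}

def occurrences(v, w):
    """Number of (possibly overlapping) occurrences of v in w."""
    return sum(1 for i in range(len(w) - len(v) + 1) if w.startswith(v, i))

def is_trapezoidal(w):
    """At most n + 1 distinct factors of length n, for every n >= 1."""
    return all(
        len({w[i:i + n] for i in range(len(w) - n + 1)}) <= n + 1
        for n in range(1, len(w) + 1)
    )

def unique_longest(cands):
    """The single longest word in cands, or None if there is none or a tie."""
    ordered = sorted(cands, key=len, reverse=True)
    if not ordered or (len(ordered) > 1 and len(ordered[0]) == len(ordered[1])):
        return None
    return ordered[0]

def special_data(w, alphabet="ab"):
    """Return (h_w, k_w, r_w, l_w) computed from the definitions."""
    fs = factors(w)
    h = unique_longest({w[:k] for k in range(len(w)) if occurrences(w[:k], w) >= 2})
    k = unique_longest({w[i:] for i in range(1, len(w) + 1) if occurrences(w[i:], w) >= 2})
    r = unique_longest({v for v in fs if all(v + c in fs for c in alphabet)})
    l = unique_longest({v for v in fs if all(c + v in fs for c in alphabet)})
    return h, k, r, l

def is_semicentral(w):
    """Definition 5.1: w is trapezoidal and h_w = k_w = r_w = l_w."""
    if not is_trapezoidal(w):
        return False
    h, k, r, l = special_data(w)
    return h is not None and h == k == r == l

def semicentral_words(n, alphabet="ab"):
    """Sorted list of the semicentral words of length n."""
    words = ("".join(t) for t in product(alphabet, repeat=n))
    return sorted(w for w in words if is_semicentral(w))

print(semicentral_words(8))
```

For n = 8 the output should coincide with the eight words listed in Example 5.3, each of the form uxyu with u in {aaa, aba, bab, bbb}.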



As a consequence of Theorem 5.2, we have that the number $SC(n)$ of semicentral words of
length n is:

$$SC(n) = \begin{cases} 0, & \text{if } n \text{ is odd}, \\ 2\phi\left(\frac{n}{2} + 1\right), & \text{if } n \text{ is even}. \end{cases}$$
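The counting formula can be compared with an explicit enumeration based on Theorem 5.2. The Python sketch below is ours and rests on two assumptions that are not stated in this section: we read the formula as SC(n) = 2φ(n/2 + 1) for even n, with φ Euler's totient function, and we use the classical characterization of central words as the words having two coprime periods p and q with |u| = p + q - 2 (a period may exceed the length of the word). The function names are hypothetical.

```python
from itertools import product
from math import gcd

def has_period(w, p):
    """p is a period of w if w[i] == w[i + p] whenever both positions exist."""
    return all(w[i] == w[i + p] for i in range(len(w) - p))

def is_central(w):
    """Assumed characterization: coprime periods p, q with len(w) == p + q - 2."""
    n = len(w)
    return any(
        gcd(p, n + 2 - p) == 1 and has_period(w, p) and has_period(w, n + 2 - p)
        for p in range(1, n + 2)
    )

def semicentral_by_theorem(n, alphabet="ab"):
    """Words u x y u of length n with u central and x != y (Theorem 5.2)."""
    if n % 2 or n < 2:
        return []
    m = n // 2 - 1  # length of the central word u
    centrals = ["".join(t) for t in product(alphabet, repeat=m)]
    centrals = [u for u in centrals if is_central(u)]
    return sorted(
        u + x + y + u for u in centrals for x in alphabet for y in alphabet if x != y
    )

def euler_phi(k):
    """Euler's totient function, by direct count."""
    return sum(1 for j in range(1, k + 1) if gcd(j, k) == 1)

def sc_formula(n):
    """SC(n) as read from the formula above, for n >= 1."""
    return 0 if n % 2 else 2 * euler_phi(n // 2 + 1)

for n in range(1, 13):
    print(n, len(semicentral_by_theorem(n)), sc_formula(n))
```

For n = 8 both counts equal 8, in agreement with Example 5.3; distinct triples (u, x, y) give distinct words, so no deduplication is needed.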

6. Open and closed prefixes of the Fibonacci word 

In this section, we give an interesting example of the open/closed dichotomy in trapezoidal 
words, by analyzing prefixes of the Fibonacci infinite word under this new light. The Fibonacci 
infinite word f is the word

$$f = abaababaabaababaababaabaaba \cdots$$

obtained as the limit of the substitution $a \mapsto ab$, $b \mapsto a$. Note that every prefix of the Fibonacci
word is Sturmian, and therefore trapezoidal.

We investigate which prefixes of f are open and which are closed. If one writes up the list of the
prefixes of f for each length (see Table 2), one can notice that the runs of "open" and "closed" prefixes
alternate following the Fibonacci sequence $F_i$ (defined by $F_1 = F_2 = 1$ and $F_{i+2} = F_{i+1} + F_i$ for
every $i \geq 1$). More precisely, the numbers of consecutive prefixes that are closed (resp. open), that
is, the lengths of the runs of "c" and "o" in Table 2, are the Fibonacci numbers. We prove this fact
in Theorem 6.1.

n                   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
$f_1 \cdots f_n$    c  o  c  o  c  c  o  o  c  c  c  o  o  o  c  c  c  c  c  o

Table 2: Which prefixes of the Fibonacci word are closed (c) and which are open (o).
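The row of Table 2 can be recomputed from the definitions. The following Python sketch is ours, with hypothetical helper names; it generates a prefix of the Fibonacci word by iterating the substitution a -> ab, b -> a and classifies each prefix as closed when its longest border occurs exactly twice, and open otherwise.

```python
def fibonacci_prefix(n):
    """Prefix of length n of the Fibonacci word (limit of a -> ab, b -> a)."""
    w = "a"
    while len(w) < n:
        w = "".join("ab" if c == "a" else "a" for c in w)
    return w[:n]

def occurrences(v, w):
    """Number of (possibly overlapping) occurrences of v in w."""
    return sum(1 for i in range(len(w) - len(v) + 1) if w.startswith(v, i))

def is_closed(w):
    """w is closed iff |w| <= 1 or its longest border occurs exactly twice."""
    if len(w) <= 1:
        return True
    for k in range(len(w) - 1, -1, -1):  # try the longest proper border first
        if w[:k] == w[len(w) - k:]:
            return occurrences(w[:k], w) == 2
    return False  # not reached: the empty border always matches

def open_closed_pattern(n):
    """String of 'c'/'o' for the prefixes of lengths 1, ..., n."""
    f = fibonacci_prefix(n)
    return "".join("c" if is_closed(f[:k]) else "o" for k in range(1, n + 1))

print(fibonacci_prefix(20))
print(open_closed_pattern(20))
```

The second line printed should reproduce the c/o row of Table 2, with runs of lengths 1, 1, 1, 1, 2, 2, 3, 3, 5 followed by the beginning of the next run.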

Theorem 6.1. Let w be a prefix of the Fibonacci word f. Then w is open if and only if there
exists i such that $F_{i+1} - 1 \leq |w| \leq 2F_i - 2$.
Proof. First, let us recall some well known properties of the Fibonacci word. Let
$S = \{\varepsilon, a, aba, abaaba, abaababaaba, \ldots\}$
be the set of palindromic prefixes of f. It is well known that the elements of S are the prefixes $s_i$
of f such that $|s_i| = F_i - 2$, $i \geq 3$. The words in S are all central words. Also, for every $i \geq 4$ we
have

$$s_{i+1} = s_i x_i y_i s_{i-1} \qquad (8)$$

where $x_i = b$, $y_i = a$ for even i, and $x_i = a$, $y_i = b$ for odd i.

Our assertion is trivially verified if $|w| \leq 3 = F_5 - 2$. Hence we only need to show that for all
$i \geq 4$, a prefix w of f with $F_{i+1} - 1 \leq |w| < F_{i+2} - 1$ is

1. open if $F_{i+1} - 1 \leq |w| \leq 2F_i - 2$, and

2. closed otherwise, i.e., when $2F_i - 1 \leq |w| \leq F_{i+2} - 2$.

Let us fix $i \geq 4$ and set $x = x_i$, $y = y_i$ as in (8). We know that the prefix of f of length $F_{i+1} - 2$
is $s_{i+1}$. By Theorem 4.9 it is a closed word and its longest repeated prefix is $s_i$. Moreover, $s_i$ is
also the longest repeated prefix of the semicentral word $s_i x y s_i$, that is the prefix of f of length
$2F_i - 2$. Therefore, $s_i$ is the longest repeated prefix for all w's in between, i.e., for each $w \in \mathrm{Pref}(f)$
with $F_{i+1} - 1 \leq |w| \leq 2F_i - 2$. As $s_i$ has an internal occurrence in all such prefixes, they are all
open, proving point 1.

It remains to show that any prefix of

$$s_{i+2} = s_i x y s_{i+1} = s_i x y s_i x y s_{i-1}$$

longer than $s_i x y s_i$ is closed. Indeed, such words have a border which is strictly longer than $s_i$, and
which cannot have any internal occurrences for otherwise $s_i x$ would have a second occurrence in
$s_i x y s_i$. □
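The arithmetic content of Theorem 6.1 can be illustrated without recomputing any borders. The following Python sketch is ours, with hypothetical function names; it only evaluates the inequality $F_{i+1} - 1 \leq n \leq 2F_i - 2$ for each length n and extracts the run lengths of the resulting pattern.

```python
from itertools import groupby

def is_open_by_theorem(n):
    """True iff F_{i+1} - 1 <= n <= 2 F_i - 2 for some i (with F_1 = F_2 = 1)."""
    a, b = 1, 1  # a = F_i and b = F_{i+1}, starting from i = 1
    while b - 1 <= n:
        if n <= 2 * a - 2:
            return True
        a, b = b, a + b
    return False

def predicted_pattern(n):
    """Pattern of 'c'/'o' predicted by Theorem 6.1 for lengths 1, ..., n."""
    return "".join("o" if is_open_by_theorem(k) else "c" for k in range(1, n + 1))

def run_lengths(s):
    """Lengths of the maximal runs of equal letters in s."""
    return [len(list(group)) for _, group in groupby(s)]

print(predicted_pattern(20))
print(run_lengths(predicted_pattern(66)))
# second line prints: [1, 1, 1, 1, 2, 2, 3, 3, 5, 5, 8, 8, 13, 13]
```

The closed run ending at length $F_{i+2} - 2$ has $F_{i+2} - 2F_i = F_{i-1}$ elements and the open run ending at $2F_i - 2$ has $2F_i - F_{i+1} = F_{i-2}$ elements, which is why every Fibonacci number appears among the run lengths.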



prefix of f                     length           open/closed   example
$s_i x y s_{i-1} = s_{i+1}$     $F_{i+1} - 2$    closed        abaababaaba
$s_i x y s_{i-1} y$             $F_{i+1} - 1$    open          abaababaabaa
$s_i x y s_{i-1} y x$           $F_{i+1}$        open          abaababaabaab
$s_i x y s_i$                   $2F_i - 2$       open          abaababaabaaba
$s_i x y s_i x$                 $2F_i - 1$       closed        abaababaabaabab
$s_i x y s_i x y$               $2F_i$           closed        abaababaabaababa
$s_i x y s_{i+1} = s_{i+2}$     $F_{i+2} - 2$    closed        abaababaabaababaaba
$s_i x y s_{i+1} x$             $F_{i+2} - 1$    open          abaababaabaababaabab

Table 3: The structure of the prefixes of f with respect to the palindromic prefixes $s_i$.

Corollary 6.2. The longest word in a run of closed prefixes of f is a central word. The longest 
word in a run of open prefixes of f is a semicentral word. 
7. Conclusions and open problems

In this paper we have investigated trapezoidal words, which are a natural generalization of 
finite Sturmian words. We exhibited a formula for counting trapezoidal words and studied open 
and closed trapezoidal words separately. 

Trapezoidal words form a subclass of the class of rich words, which are words containing the
maximum number of palindromic factors. A challenging problem is that of finding an enumerative
formula for rich words. This problem is still open, even in the binary case. We think that separating
rich words into open and closed words could give further insight into these words.

More generally, we think that the open/closed dichotomy can be useful in the study of other 
classes of words. For instance, it would be interesting to generalize the results obtained in Section 6
for the Fibonacci word to the class of standard words. 